DTW-distance-ordered spoken term detection

Authors

Teppei Ohno

Tomoyosi Akiba

Abstract

The amount of Web-based multimedia data that includes speech is increasing rapidly. Spoken term detection (STD) enables rapid identification of desired-information candidates from a large quantity of speech data. Considering that these STD candidates ultimately have to be checked one at a time by the user, a long list of candidates is not desirable. However, setting an appropriate cutoff threshold for a particular STD request beforehand is not easy. In this work, we propose a novel indexing and search method for STD that requires no cutoff threshold for detection but can output detection results in increasing order of their dynamic time warping (DTW) distances for a given query term. Our experimental evaluation showed that, whereas using the strict algorithm for our method gave detection results that were exactly in increasing order of their DTW distances, its relaxed variants were able to execute much faster at the cost of only a slight loss in the exact ordering.
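The idea of returning detections in increasing order of DTW distance, with no cutoff threshold, can be illustrated with a minimal sketch. This is not the indexing method proposed in the paper: it is a brute-force baseline that computes every distance up front and then yields candidates lazily from a heap. The function names (`dtw_distance`, `ranked_detections`), the use of one-dimensional feature sequences, and the absolute-difference local cost are all assumptions made for illustration.

```python
import heapq
from typing import Iterator, List, Sequence, Tuple


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW between two 1-D sequences.

    The local cost |x - y| is an illustrative assumption; a real STD
    system would compare subword or acoustic feature units instead.
    """
    inf = float("inf")
    # prev[j] holds the best cumulative cost for the previous row of the
    # DTW table; prev[0] = 0 is the empty-prefix boundary condition.
    prev: List[float] = [0.0] + [inf] * len(b)
    for x in a:
        curr: List[float] = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            # Extend the cheapest of the three allowed predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def ranked_detections(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
) -> Iterator[Tuple[float, int]]:
    """Yield (distance, candidate_index) in increasing DTW distance.

    No threshold is needed: the caller simply stops iterating once
    enough candidates have been inspected.
    """
    heap = [(dtw_distance(query, c), i) for i, c in enumerate(candidates)]
    heapq.heapify(heap)
    while heap:
        yield heapq.heappop(heap)


if __name__ == "__main__":
    query = [1, 2, 3]
    candidates = [[1, 2, 3], [5, 5, 5], [1, 2, 2, 4], [2, 3, 4]]
    for distance, index in ranked_detections(query, candidates):
        print(index, distance)
```

Because the results come from a generator, a user can inspect candidates one at a time and stop at any point, which mirrors the threshold-free behavior described in the abstract. The sketch still pays the full cost of computing every distance before the first result, which is exactly the cost an index is meant to avoid.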

Related resources

DTW-Distance-Ordered Spoken Term Detection and STD-based Spoken Content Retrieval: Experiments at NTCIR-10 SpokenDoc-2

In this paper, we report our experiments at the NTCIR-10 SpokenDoc-2 task. We participated in both the STD and SCR subtasks of SpokenDoc. For the STD subtask, we applied a novel indexing method, called metric subspace indexing, that we proposed previously. One of the distinctive advantages of the method was that it could output the detection results in increasing order of distance without using any predefined...

Use of GPU and Feature Reduction for Fast Query-by-Example Spoken Term Detection

For query-by-example spoken term detection (QbE-STD) on low-resource languages, variants of dynamic time warping (DTW) techniques are used. However, DTW-based techniques are slow, which limits search in large spoken audio databases. In order to enable fast search in large databases, we exploit the intensive parallel computation of graphics processing units (GPUs). In thi...

Unsupervised spoken-term detection with spoken queries using segment-based dynamic time warping

Spoken term detection is important for retrieval of multimedia and spoken content over the Internet. Because it is difficult to have acoustic/language models well matched to the huge quantities of spoken documents produced under various conditions, unsupervised approaches using frame-based dynamic time warping (DTW) have been proposed to compare the spoken query with spoken documents frame by fr...

Utilizing state-level distance vector representation for improved spoken term detection by text and spoken queries

In spoken term detection (STD) systems, approximate subword-level matching of the query term and automatically transcribed spoken documents is often employed for its reasonable accuracy and efficiency. However, a high out-of-vocabulary (OOV) rate often degrades the subword-level recognition accuracy and affects the STD performance. This paper describes the usage of new expanded acoustic representations...

The LF Query-by-Example Spoken Term Detection system for the ALBAYZIN 2016 evaluation

Query-by-Example Spoken Term Detection (QbE-STD) is the task of finding occurrences of a spoken query in a repository of audio documents. In recent years, this task has become particularly appealing, mostly due to its flexibility, which makes it possible, for instance, to deal with low-resourced languages for which no Automatic Speech Recognition (ASR) system can be built. This paper reports experimental r...

Publication date: 2013
